Mike Schmit's Top Ten Rules for Pairing Pentium Instructions
--- Instruction pairing rules for Intel Pentium Processor ---
1. Both instructions must be simple
2. Shifts or rotates (SHL, SHR, SAL, SAR, ROL, ROR, RCL or RCR) can only
pair in the U pipe
3. ADC and SBB can only pair in the U pipe
4. JMP, CALL and Jcc (jump on condition code) can only pair in the V pipe
5. Neither instruction can contain BOTH a displacement and an immediate
operand. For example:
    mov [bx+2], 3       ; 2 is a displacement, 3 is immediate
    mov mem1, 4         ; mem1 is a displacement, 4 is immediate
6. Prefixed instructions can only pair in the U pipe. This includes
extended instructions that start with 0Fh except for the special
case of the 16-bit conditional jumps of the 386 and above. Examples
of prefixed instructions:
    mov ES:[bx],
    mov eax, [si]       ; 32-bit operand in 16-bit code segment
    mov ax, [esi]       ; 16-bit operand in 32-bit code segment
7. The U pipe instruction must be only 1 byte in length or it will
not pair until the second time it executes from the cache.
8. There can be no read-after-write or write-after-write register
dependencies between the instructions except for special cases for
the flags register and the stack pointer (rules 9 and 10). For example:
    mov ebx, 2          ; writes to EBX
    add ecx, ebx        ; reads EBX and ECX, writes to ECX
                        ; EBX is read after being written, no pairing
    mov ebx, 1          ; writes to EBX
    mov ebx, 2          ; writes to EBX
                        ; write after write, no pairing
9. The flags register exception allows an ALU instruction to be paired
with a Jcc even though the ALU instruction writes the flags and Jcc
reads the flags. For example:
    cmp al, 0           ; CMP modifies the flags
    je addr             ; JE reads the flags, but pairs
    dec cx              ; DEC modifies the flags
    jnz loop1           ; JNZ reads the flags, but pairs
10. The stack pointer exception allows two PUSHes or two POPs to
be paired even though they both read and write the SP (or ESP)
register. For example:
    push eax            ; ESP is read and modified
    push ebx            ; ESP is read and modified, but still pairs
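A minimal sketch of how rules 2 and 8 can guide hand scheduling. The
registers and constants are arbitrary choices for illustration. The code
assumes a 32-bit code segment (so none of these instructions needs a
prefix) and that it is already running from the cache (rule 7). The U/V
comments are what the rules above predict, not measured timings.

```asm
; Rule 8: as written, the second instruction of each chain reads the
; register loaded just before it.
        mov     ebx, 2          ; U, alone: the next instruction reads EBX
        add     ecx, ebx        ; U
        mov     edx, 4          ; V, pairs with the ADD above
        mov     edi, edx        ; U, alone: nothing left to pair with

; Interleaving the two independent chains lets both pairs form.
        mov     ebx, 2          ; U
        mov     edx, 4          ; V, no shared registers
        add     ecx, ebx        ; U
        mov     edi, edx        ; V, no shared registers

; Rule 2: a shift pairs only as the first instruction of the pair.
        shl     eax, 2          ; U
        mov     esi, 1          ; V, pairs
; Swapping these two lines would put SHL second, where it cannot pair.
```

Under these assumptions the first sequence issues in three groups and the
interleaved one in two.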
--- Simple Instructions (for Pentium pairing) ---
The following is a list of simple instructions, as required by rule
#1 above.
Instruction format  16-bit example      32-bit example
------------------------------------------------------------
MOV reg, reg        mov ax, bx          mov eax, edx
MOV reg, mem        mov ax, [bx]        mov eax, [edx]
MOV reg, imm        mov ax, 1           mov eax, 1
MOV mem, reg        mov [bx], ax        mov [edx], eax
MOV mem, imm        mov [bx], 1         mov [edx], 1
alu reg, reg        add ax, bx          cmp eax, edx
alu reg, mem        add ax, [bx]        cmp eax, [edx]
alu reg, imm        add ax, 1           cmp eax, 1
alu mem, reg        add [bx], ax        cmp [edx], eax
alu mem, imm        add [bx], 1         cmp [edx], 1
  where alu = add, adc, and, or, xor, sub, sbb, cmp, test
INC reg             inc ax              inc eax
INC mem             inc var1            inc [eax]
DEC reg             dec bx              dec ebx
DEC mem             dec [bx]            dec var2
PUSH reg            push ax             push eax
POP reg             pop ax              pop eax
LEA reg, mem        lea ax, [si+2]      lea eax, [eax+4*esi+8]
JMP near            jmp label           jmp label2
CALL near           call proc           call proc2
Jcc near            jz lbl              jnz lbl2
  where Jcc = ja, jae, jb, jbe, jg, jge, jl, jle, je, jne, jc, js,
              jnp, jo, jp, jnbe, jnb, jnae, jna, jnle, jnl, jnge,
              jng, jz, jnz, jnc, jns, jpo, jno, jpe
NOP                 nop                 nop
shift reg, 1        shl ax, 1           rcl eax, 1
shift mem, 1        shr [bx], 1         rcr [ebx], 1
shift reg, imm      sal ax, 2           rol esi, 2
shift mem, imm      sar [bx], 15        ror [esi], 31
  where shift = shl, shr, sal, sar, rcl, rcr, rol, ror
--- Notes ---
- rcl and rcr are not pairable with immediate counts other than 1
- memory-immediate (mem, imm) instructions are not pairable when the
memory operand includes a displacement (rule 5)
- instructions that take a segment register as an operand are not pairable
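One possible way around the displacement-plus-immediate restriction is to
stage the immediate in a register first. This is a minimal sketch in
16-bit code. It assumes AX is free at this point, and "word ptr" is added
only to make the operand size explicit for the assembler.

```asm
; Displacement and immediate in one instruction: not pairable (rule 5).
        mov     word ptr [bx+2], 3

; Split form: each instruction carries only one of the two.
        mov     ax, 3           ; immediate only
        mov     [bx+2], ax      ; displacement only
```

The two halves cannot pair with each other, because the store reads the AX
value just written (rule 8). The split therefore only helps when other
independent instructions are available to pair with each half.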